# Advanced partitions
This page documents the technical partition policy enforced by Slurm and `job_submit.lua`.
## Partition policy
| Partition | Accepted job mode | Enforced GPU/GRES type | Default ntasks | Default cpus-per-task | CPU cap (ntasks * cpus-per-task) | Default memory (DefMemPerNode) | Max memory (MaxMemPerNode) | Default time | Max time |
|---|---|---|---|---|---|---|---|---|---|
| interactive10 | `srun` | `gpu:nvidia_a100_1g.10gb:1` | 1 | 4 | 4 | 16G | 16G | partition default | 2h |
| prod10 | `sbatch` | `gpu:nvidia_a100_1g.10gb:1` | 1 | 4 | 4 | 15G | 15G | 4h | 24h |
| prod40 | `sbatch` | `gpu:nvidia_a100_3g.40gb:1` | 1 | 16 | 16 | 60G | 60G | 4h | 24h |
| prod80 | `sbatch` | `gpu:nvidia_a100-sxm4-80gb:1` | 1 | 32 | 32 | 120G | 120G | 4h | 24h |
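The values in the table should match what Slurm itself reports. As a sanity check, you can query the live partition configuration with a standard `scontrol` invocation (the partition name here is taken from the table):

```shell
# Inspect the live limits of a partition, e.g. prod40
scontrol show partition prod40 | grep -E 'DefMemPerNode|MaxMemPerNode|MaxTime|DefaultTime'
```

If the reported values differ from this page, trust `scontrol` and report the discrepancy to support.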
## What is QoS (Quality of Service)?
In Slurm, a QoS (Quality of Service) is a policy profile attached to jobs, users, or accounts. It controls scheduling behavior such as limits (for example, the maximum number of concurrent jobs), priority, and preemption rules. On this cluster, QoS enforces part of the job concurrency policy.
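To see the limits attached to your QoS and how close you are to them, something like the following should work with standard Slurm tooling (the QoS name `normal` comes from the policy below; whether regular users can run `sacctmgr` may depend on the cluster configuration):

```shell
# Show the per-user job limits attached to the "normal" QoS
sacctmgr show qos normal format=Name,MaxJobsPU,MaxSubmitPU

# Count your currently running jobs (compare against the concurrency limits)
squeue -u "$USER" -t RUNNING -h | wc -l
```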
## What `job_submit.lua` enforces
- A partition must be provided (`-p`/`--partition`).
- Missing `--gres` is auto-filled from partition policy.
- Missing `--ntasks` is auto-filled to `1`.
- Missing `--cpus-per-task` is auto-filled from partition policy.
- CPU requests above partition policy are rejected.
- Memory defaults and caps come from partition `DefMemPerNode`/`MaxMemPerNode`.
- `prod*` partitions reject interactive submissions (`srun`); use `sbatch`.
- For jobs in QoS `normal` (or empty QoS), at most 4 running jobs are allowed in total.
- For jobs in QoS `normal` (or empty QoS), at most 2 running jobs are allowed across `prod40` + `prod80`.
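To illustrate how these rules play out at submit time (the exact error messages depend on the `job_submit.lua` implementation, so they are not reproduced here):

```shell
# Rejected: 8 cpus-per-task exceeds the prod10 cap of 4
sbatch -p prod10 --cpus-per-task=8 --wrap="python3 train.py"

# Rejected: prod* partitions only accept sbatch, not srun
srun -p prod40 --pty bash

# Accepted: --gres, --ntasks=1 and --cpus-per-task=4 are auto-filled from policy
sbatch -p prod10 --time=02:00:00 --wrap="python3 train.py"
```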
If your workload needs more resources or higher limits, contact support: dgx_support@listes.centralesupelec.fr.
## Simplified vs explicit submissions
Simplified (recommended first):
```shell
srun -p interactive10 --time=00:30:00 --pty bash
sbatch -p prod10 --time=04:00:00 --wrap="python3 train.py"
```
Explicit (advanced override):
```shell
sbatch -p prod40 --gres=gpu:nvidia_a100_3g.40gb:1 --ntasks=1 --cpus-per-task=16 --time=08:00:00 train.sbatch
```
Use explicit settings only when you need to override defaults for a specific workload.
Technical rationale: memory is budgeted with system headroom on the DGX node (about 32G reserved), then split by GPU class for production partitions.
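As a back-of-the-envelope check of that budgeting (the 1024G node total below is an assumption for illustration only, not a documented figure for this cluster), the per-job caps in the table work out to roughly 120G per full A100:

```shell
# All figures in GiB. NODE_TOTAL is an assumed value for illustration only.
NODE_TOTAL=1024
HEADROOM=32                        # system headroom from the rationale above
BUDGET=$(( NODE_TOTAL - HEADROOM ))

# Per-GPU memory implied by the per-job caps in the partition table:
#   prod80: 1 job per full GPU      * 120G = 120G
#   prod40: 2 MIG 3g.40gb slices    *  60G = 120G
#   prod10: 7 MIG 1g.10gb slices    *  15G = 105G
PER_GPU=120
GPUS=8                             # a DGX A100 node has 8 GPUs
echo "budget=${BUDGET}G, worst-case demand=$(( GPUS * PER_GPU ))G"
```

Under these assumptions the worst-case demand (8 * 120G = 960G) fits within the post-headroom budget, which is consistent with the stated rationale.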